Improved Bounds and Schemes for the Declustering Problem
Authors
Abstract
The declustering problem is to allocate given data on storage devices working in parallel in such a manner that typical requests find their data evenly distributed across the devices. Using deep results from discrepancy theory, we improve previous work of several authors concerning range queries to higher-dimensional data. We give a declustering scheme with an additive error of $O_d(\log^{d-1} M)$ independent of the data size, where $d$ is the dimension, $M$ is the number of storage devices, and $d - 1$ does not exceed the smallest prime power in the canonical decomposition of $M$ into prime powers. In particular, our schemes work for arbitrary $M$ in dimensions two and three. For general $d$, they work for all $M \ge d - 1$ that are powers of two. Concerning lower bounds, we show that a recent proof of an $\Omega_d(\log^{(d-1)/2} M)$ bound contains an error. We close the gap in the proof and thus establish the bound.

Supported by the DFG-Graduiertenkolleg 357 “Effiziente Algorithmen und Mehrskalenmethoden”. Max–Planck–Institut für Informatik, Saarbrücken, Germany. Institut für Informatik und Praktische Mathematik, Christian-Albrechts-Universität zu Kiel, Germany.
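To make the notion of additive error concrete, the following minimal Python sketch measures it by brute force on a small two-dimensional grid. It assumes the usual definition: for a range query Q on M devices, the error is the largest number of cells of Q stored on a single device minus the ideal ceil(|Q| / M), and the additive error of a scheme is the maximum of this over all queries. The scheme `disk_modulo` and the function names are hypothetical illustrations chosen here; this is not the discrepancy-based scheme from the abstract.

```python
from math import ceil


def disk_modulo(x, y, m):
    """Illustrative scheme (an assumption, not the paper's): cell (x, y) -> device (x + y) mod m."""
    return (x + y) % m


def additive_error(scheme, n, m):
    """Brute-force worst-case additive error over all axis-parallel
    rectangular queries in an n x n grid stored on m devices."""
    worst = 0
    for x0 in range(n):
        for x1 in range(x0, n):
            for y0 in range(n):
                for y1 in range(y0, n):
                    # Count how many cells of the query land on each device.
                    loads = [0] * m
                    for x in range(x0, x1 + 1):
                        for y in range(y0, y1 + 1):
                            loads[scheme(x, y, m)] += 1
                    size = (x1 - x0 + 1) * (y1 - y0 + 1)
                    # Ideal response time is ceil(size / m) parallel accesses.
                    worst = max(worst, max(loads) - ceil(size / m))
    return worst


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, additive_error(disk_modulo, 4, m))
```

The exhaustive loop is only practical for small grids; it is meant to illustrate the quantity that the bounds above control, not to evaluate schemes at scale.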
Similar resources
A Hierarchical Technique for Constructing Efficient Declustering Schemes for Range Queries
Multi-disk systems, coupled with declustering schemes, have been widely used in various applications to improve I/O performance by enabling parallel disk accesses. A declustering scheme determines how data blocks should be placed among multiple disks to maximize the parallelism. We focus on the problem of declustering grid-structured multidimensional data with the objective of reducing the resp...
Declustering Using Golden Ratio Sequences
In this paper we propose a new data declustering scheme for range queries. Our scheme is based on Golden Ratio Sequences (GRS), which have found applications in broadcast disks, hashing, packet routing, etc. We show by analysis and simulation that GRS is nearly the best possible scheme for 2-dimensional range queries. Specifically, it is the best possible scheme when the number of disks (M) is a...
A Novel B and B Algorithm for a Unrelated Parallel Machine Scheduling Problem to Minimize the Total Weighted Tardiness
This paper presents a scheduling problem with unrelated parallel machines and sequence-dependent setup times that minimizes the total weighted tardiness. A new branch-and-bound (B and B) algorithm is designed incorporating the lower and upper bounding schemes and several dominance properties. The lower and upper bounds are derived through an assignment problem and the composite dispatching rule ...
Iterative-improvement-based declustering heuristics for multi-disk databases
Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a ...
Threshold-based declustering
Declustering techniques reduce query response time through parallel I/O by distributing data among multiple devices. Except for a few cases, it is not possible to find declustering schemes that are optimal for all spatial range queries. As a result, most of the research on declustering has focused on finding schemes with low worst-case additive error. However, additive error based scheme...